Construct Non-Hierarchical P/NBD Model for CD-Now Transaction Data

Author

Mick Cooney

Published

July 16, 2023

In this workbook we construct the non-hierarchical P/NBD models on the CD-NOW transaction data.

1 Load and Construct Datasets

We start by modelling the P/NBD model using our synthetic datasets before we try to model real-life data.

Show code
use_fit_start_date <- as.Date("1997-01-01")
use_fit_end_date   <- as.Date("1997-12-31")

use_valid_start_date <- as.Date("1998-01-01")
use_valid_end_date   <- as.Date("1998-07-01")

1.1 Load CD-NOW Transaction Data

We now want to load the CD-NOW transaction data.

Show code
customer_cohortdata_tbl <- read_rds("data/cdnow_customer_cohort_data_tbl.rds")
customer_cohortdata_tbl |> glimpse()
Rows: 23,570
Columns: 5
$ customer_id     <chr> "00001", "00002", "00003", "00004", "00005", "00006", …
$ cohort_qtr      <chr> "1997 Q1", "1997 Q1", "1997 Q1", "1997 Q1", "1997 Q1",…
$ cohort_ym       <chr> "1997 01", "1997 01", "1997 01", "1997 01", "1997 01",…
$ first_tnx_date  <date> 1997-01-01, 1997-01-12, 1997-01-02, 1997-01-01, 1997-…
$ total_tnx_count <int> 1, 2, 6, 4, 11, 1, 3, 8, 3, 1, 4, 1, 1, 1, 1, 4, 1, 1,…
Show code
customer_transactions_tbl <- read_rds("data/cdnow_transaction_data_tbl.rds")
customer_transactions_tbl |> glimpse()
Rows: 69,659
Columns: 8
$ customer_id   <chr> "00001", "00002", "00002", "00003", "00003", "00003", "0…
$ tnx_date      <date> 1997-01-01, 1997-01-12, 1997-01-12, 1997-01-02, 1997-03…
$ tnx_timestamp <dttm> 1997-01-01 21:57:19, 1997-01-12 06:52:02, 1997-01-12 22…
$ tnx_dow       <fct> Wed, Sun, Sun, Thu, Sun, Wed, Sat, Tue, Thu, Wed, Sat, S…
$ tnx_month     <fct> Jan, Jan, Jan, Jan, Mar, Apr, Nov, Nov, May, Jan, Jan, A…
$ tnx_week      <chr> "00", "01", "01", "00", "12", "13", "45", "47", "21", "0…
$ cd_count      <int> 1, 5, 1, 2, 2, 2, 5, 4, 1, 2, 2, 1, 2, 2, 1, 3, 3, 3, 2,…
$ total_spend   <dbl> 11.77, 77.00, 12.00, 20.76, 20.76, 19.54, 57.45, 20.96, …
Show code
customer_subset_id <- read_rds("data/cdnow_customer_subset_ids.rds")
customer_subset_id |> glimpse()
 chr [1:2000] "00014" "00018" "00032" "00061" "00064" "00076" "00081" ...

We re-produce the visualisation of the transaction times we used in previous workbooks.

Show code
plot_tbl <- customer_transactions_tbl |>
  group_nest(customer_id, .key = "cust_data") |>
  filter(map_int(cust_data, nrow) > 3) |>
  slice_sample(n = 30) |>
  unnest(cust_data)

ggplot(plot_tbl, aes(x = tnx_timestamp, y = customer_id)) +
  geom_line() +
  geom_point() +
  labs(
      x = "Date",
      y = "Customer ID",
      title = "Visualisation of Customer Transaction Times"
    ) +
  theme(axis.text.y = element_text(size = 10))

1.2 Load Derived Data

Show code
obs_fitdata_tbl   <- read_rds("data/cdnow_obs_fitdata_tbl.rds")
obs_validdata_tbl <- read_rds("data/cdnow_obs_validdata_tbl.rds")

customer_fit_stats_tbl <- obs_fitdata_tbl |>
  rename(x = tnx_count)

1.3 Load Subset Data

We also want to construct our data subsets for the purposes of speeding up our valuations.

Show code
customer_fit_subset_tbl <- obs_fitdata_tbl |>
  filter(customer_id %in% customer_subset_id)

customer_fit_subset_tbl |> glimpse()
Rows: 2,000
Columns: 6
$ customer_id    <chr> "00014", "00018", "00032", "00061", "00064", "00076", "…
$ first_tnx_date <dttm> 1997-01-01 21:18:22, 1997-01-04 18:49:57, 1997-01-01 1…
$ last_tnx_date  <dttm> 1997-01-01 21:18:22, 1997-01-04 18:49:57, 1997-01-24 1…
$ tnx_count      <dbl> 0, 0, 1, 0, 8, 0, 13, 12, 0, 2, 0, 6, 0, 0, 1, 1, 0, 0,…
$ t_x            <dbl> 0.0000000, 0.0000000, 3.3184159, 0.0000000, 51.5902377,…
$ T_cal          <dbl> 51.87318, 51.45933, 51.92642, 51.97308, 51.68499, 51.97…
Show code
customer_valid_subset_tbl <- obs_validdata_tbl |>
  filter(customer_id %in% customer_subset_id)

customer_valid_subset_tbl |> glimpse()
Rows: 2,000
Columns: 3
$ customer_id       <chr> "00014", "00018", "00032", "00061", "00064", "00076"…
$ tnx_count         <dbl> 0, 0, 1, 0, 4, 0, 0, 6, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0…
$ tnx_last_interval <dbl> NA, NA, 9.945594, NA, 25.052557, NA, NA, 25.303557, …

2 Fit First P/NBD Model

We now construct our Stan model and prepare to fit it with our synthetic dataset.

Before we start on that, we set a few parameters for the workbook to organise our Stan code.

Show code
stan_modeldir <- "stan_models"
stan_codedir  <-   "stan_code"

We also want to set a number of overall parameters for this workbook

To start the fit data, we want to use the 1,000 customers. We also need to calculate the summary statistics for the validation period.

2.1 Compile and Fit Stan Model

We now compile this model using CmdStanR.

Show code
pnbd_fixed_stanmodel <- cmdstan_model(
  "stan_code/pnbd_fixed.stan",
  include_paths =   stan_codedir,
  pedantic      =           TRUE,
  dir           =  stan_modeldir
  )

We then use this compiled model with our data to produce a fit of the data.

Show code
stan_modelname <- "pnbd_cdnow_fixed1"
stanfit_seed   <- stanfit_seed + 1
stanfit_prefix <- str_c("fit_", stan_modelname) 

stanfit_object_file <- glue("data/{stanfit_prefix}_stanfit.rds")

stan_data_lst <- customer_fit_stats_tbl |>
  select(customer_id, x, t_x, T_cal) |>
  compose_data(
    lambda_mn = 0.25,
    lambda_cv = 1.00,
    
    mu_mn     = 0.10,
    mu_cv     = 1.00,
    )

if(!file_exists(stanfit_object_file)) {
  pnbd_cdnow_fixed1_stanfit <- pnbd_fixed_stanmodel$sample(
    data            =                stan_data_lst,
    chains          =                            4,
    iter_warmup     =                          500,
    iter_sampling   =                          500,
    seed            =                 stanfit_seed,
    save_warmup     =                         TRUE,
    output_dir      =                stan_modeldir,
    output_basename =               stanfit_prefix,
    )
  
  pnbd_cdnow_fixed1_stanfit$save_object(stanfit_object_file, compress = "gzip")

} else {
  pnbd_cdnow_fixed1_stanfit <- read_rds(stanfit_object_file)
}

pnbd_cdnow_fixed1_stanfit$summary()
# A tibble: 70,711 × 10
   variable       mean   median      sd     mad       q5      q95  rhat ess_bulk
   <chr>         <num>    <num>   <num>   <num>    <num>    <num> <num>    <num>
 1 lp__       -2.87e+5 -2.87e+5 1.66e+2 1.56e+2 -2.87e+5 -2.87e+5 1.00      669.
 2 lambda[1]   1.40e-1  8.41e-2 1.65e-1 9.44e-2  4.65e-3  4.57e-1 1.00     3336.
 3 lambda[2]   3.40e-1  2.72e-1 2.78e-1 2.16e-1  4.17e-2  8.98e-1 1.00     3265.
 4 lambda[3]   8.97e-2  8.46e-2 4.13e-2 3.78e-2  3.54e-2  1.69e-1 1.00     3719.
 5 lambda[4]   7.19e-2  6.65e-2 3.52e-2 3.19e-2  2.45e-2  1.38e-1 1.00     3456.
 6 lambda[5]   1.80e-1  1.75e-1 5.79e-2 5.69e-2  9.26e-2  2.86e-1 0.999    3219.
 7 lambda[6]   1.39e-1  7.69e-2 1.68e-1 8.78e-2  4.82e-3  4.75e-1 1.00     2376.
 8 lambda[7]   3.81e-2  3.20e-2 2.69e-2 2.27e-2  6.33e-3  9.07e-2 1.00     3663.
 9 lambda[8]   1.25e-1  1.20e-1 4.80e-2 4.55e-2  5.81e-2  2.15e-1 1.01     4074.
10 lambda[9]   6.21e-2  4.81e-2 5.10e-2 3.86e-2  9.03e-3  1.63e-1 0.999    3331.
# ℹ 70,701 more rows
# ℹ 1 more variable: ess_tail <num>

We have some basic HMC-based validity statistics we can check.

Show code
pnbd_cdnow_fixed1_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed1-1.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed1-2.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed1-3.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed1-4.csvWarning: non-fatal error reading adaptation data


Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
No divergent transitions found.

Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete, no problems detected.

2.2 Visual Diagnostics of the Sample Validity

Now that we have a sample from the posterior distribution we need to create a few different visualisations of the diagnostics.

Show code
parameter_subset <- c(
  "lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
  "mu[1]",     "mu[2]",     "mu[3]",     "mu[4]"
  )

pnbd_cdnow_fixed1_stanfit$draws(inc_warmup = FALSE) |>
  mcmc_trace(pars = parameter_subset) +
  expand_limits(y = 0) +
  labs(
    x = "Iteration",
    y = "Value",
    title = "Traceplot of Sample of Lambda and Mu Values"
    ) +
  theme(axis.text.x = element_text(size = 10))

We also check \(N_{eff}\) as a quick diagnostic of the fit.

Show code
pnbd_cdnow_fixed1_stanfit |>
  neff_ratio(pars = c("lambda", "mu")) |>
#  as.numeric() |>
  mcmc_neff() +
    ggtitle("Plot of Parameter Effective Sample Sizes")

2.3 Assess the Model

As we intend to run the same logic to assess each of our models, we have combined all this logic into a single function run_model_assessment, to run the simulations and combine the datasets.

We first run the assessment data.

Show code
pnbd_stanfit <- pnbd_cdnow_fixed1_stanfit |>
  recover_types(customer_fit_stats_tbl)

pnbd_cdnow_fixed1_assess_data_lst <- run_model_assessment(
  model_stanfit       = pnbd_stanfit,
  insample_tbl        = customer_fit_subset_tbl,
  fit_label           = "pnbd_cdnow_fixed1",
  fit_end_dttm        = use_fit_end_date     |> as.POSIXct(),
  valid_start_dttm    = use_valid_start_date |> as.POSIXct(),
  valid_end_dttm      = use_valid_end_date   |> as.POSIXct(),
  precompute_rootdir  = "precompute",
  data_dir            = "data",
  summary_include_tnx = FALSE,
  sim_seed            = 10
  )

pnbd_cdnow_fixed1_assess_data_lst |> glimpse()
List of 5
 $ model_fit_index_filepath     : 'glue' chr "data/pnbd_cdnow_fixed1_assess_fit_index_tbl.rds"
 $ model_valid_index_filepath   : 'glue' chr "data/pnbd_cdnow_fixed1_assess_valid_index_tbl.rds"
 $ model_simstats_filepath      : 'glue' chr "data/pnbd_cdnow_fixed1_assess_model_simstats_tbl.rds"
 $ model_fit_simstats_filepath  : 'glue' chr "data/pnbd_cdnow_fixed1_assess_fit_simstats_tbl.rds"
 $ model_valid_simstats_filepath: 'glue' chr "data/pnbd_cdnow_fixed1_assess_valid_simstats_tbl.rds"

2.3.1 Check In-Sample Data Validation

We first check the model against the in-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed1_assess_data_lst |>
  use_series(model_fit_simstats_filepath) |>
  read_rds()

insample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_fit_subset_tbl,
  simdata_tbl = simdata_tbl
  )

insample_plots_lst$multi_plot |> print()

Show code
insample_plots_lst$total_plot |> print()

Show code
insample_plots_lst$quant_plot |> print()

This fit looks reasonable and appears to capture most of the aspects of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but at least we appreciate that our model is valid.

2.3.2 Check Out-of-Sample Data Validation

We now repeat for the out-of-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed1_assess_data_lst |>
  use_series(model_valid_simstats_filepath) |>
  read_rds()

outsample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_valid_subset_tbl,
  simdata_tbl = simdata_tbl
  )

outsample_plots_lst$multi_plot |> print()

Show code
outsample_plots_lst$total_plot |> print()

Show code
outsample_plots_lst$quant_plot |> print()

3 Fit Alternate Prior Model.

We want to try an alternate prior model with a smaller co-efficient of variation to see what impact it has on our procedures.

Show code
stan_modelname <- "pnbd_cdnow_fixed2"
stanfit_seed   <- stanfit_seed + 1
stanfit_prefix <- str_c("fit_", stan_modelname) 

stanfit_object_file <- glue("data/{stanfit_prefix}_stanfit.rds")


stan_data_lst <- customer_fit_stats_tbl |>
  select(customer_id, x, t_x, T_cal) |>
  compose_data(
    lambda_mn = 0.25,
    lambda_cv = 0.50,
    
    mu_mn     = 0.10,
    mu_cv     = 0.50,
    )

if(!file_exists(stanfit_object_file)) {
  pnbd_cdnow_fixed2_stanfit <- pnbd_fixed_stanmodel$sample(
    data            =                stan_data_lst,
    chains          =                            4,
    iter_warmup     =                          500,
    iter_sampling   =                          500,
    seed            =                 stanfit_seed,
    save_warmup     =                         TRUE,
    output_dir      =                stan_modeldir,
    output_basename =               stanfit_prefix,
    )

  pnbd_cdnow_fixed2_stanfit$save_object(stanfit_object_file, compress = "gzip")

} else {
  pnbd_cdnow_fixed2_stanfit <- read_rds(stanfit_object_file)
}

pnbd_cdnow_fixed2_stanfit$summary()
# A tibble: 70,711 × 10
   variable       mean   median      sd     mad       q5      q95  rhat ess_bulk
   <chr>         <num>    <num>   <num>   <num>    <num>    <num> <num>    <num>
 1 lp__       -7.03e+5 -7.03e+5 1.59e+2 1.56e+2 -7.03e+5 -7.03e+5  1.00     636.
 2 lambda[1]   2.10e-1  1.89e-1 1.11e-1 1.02e-1  7.01e-2  4.27e-1  1.00    4251.
 3 lambda[2]   2.68e-1  2.45e-1 1.27e-1 1.19e-1  1.02e-1  5.16e-1  1.01    3647.
 4 lambda[3]   1.20e-1  1.14e-1 4.20e-2 4.02e-2  6.13e-2  1.94e-1  1.01    4063.
 5 lambda[4]   1.03e-1  9.88e-2 3.95e-2 3.81e-2  4.79e-2  1.73e-1  1.01    4783.
 6 lambda[5]   1.90e-1  1.84e-1 5.16e-2 5.19e-2  1.15e-1  2.83e-1  1.00    5774.
 7 lambda[6]   2.09e-1  1.92e-1 1.05e-1 9.56e-2  7.24e-2  4.14e-1  1.00    4653.
 8 lambda[7]   7.92e-2  7.41e-2 3.59e-2 3.37e-2  2.95e-2  1.44e-1  1.00    4430.
 9 lambda[8]   1.48e-1  1.45e-1 4.62e-2 4.54e-2  7.93e-2  2.30e-1  1.00    4283.
10 lambda[9]   1.25e-1  1.16e-1 5.88e-2 5.35e-2  4.86e-2  2.37e-1  1.00    4263.
# ℹ 70,701 more rows
# ℹ 1 more variable: ess_tail <num>

We have some basic HMC-based validity statistics we can check.

Show code
pnbd_cdnow_fixed2_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed2-1.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed2-2.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed2-3.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed2-4.csvWarning: non-fatal error reading adaptation data


Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
No divergent transitions found.

Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete, no problems detected.

3.1 Visual Diagnostics of the Sample Validity

Now that we have a sample from the posterior distribution we need to create a few different visualisations of the diagnostics.

Show code
parameter_subset <- c(
  "lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
  "mu[1]",     "mu[2]",     "mu[3]",     "mu[4]"
  )

pnbd_cdnow_fixed2_stanfit$draws(inc_warmup = FALSE) |>
  mcmc_trace(pars = parameter_subset) +
  expand_limits(y = 0) +
  labs(
    x = "Iteration",
    y = "Value",
    title = "Traceplot of Sample of Lambda and Mu Values"
    ) +
  theme(axis.text.x = element_text(size = 10))

We want to check the \(N_{eff}\) statistics also.

Show code
pnbd_cdnow_fixed2_stanfit |>
  neff_ratio(pars = c("lambda", "mu")) |>
  as.numeric() |>
  mcmc_neff() +
    ggtitle("Plot of Parameter Effective Sample Sizes")

3.2 Assess the Model

As we intend to run the same logic to assess each of our models, we have combined all this logic into a single function run_model_assessment, to run the simulations and combine the datasets.

We first run the assessment data.

Show code
pnbd_stanfit <- pnbd_cdnow_fixed2_stanfit |>
  recover_types(customer_fit_stats_tbl)

pnbd_cdnow_fixed2_assess_data_lst <- run_model_assessment(
  model_stanfit       = pnbd_stanfit,
  insample_tbl        = customer_fit_subset_tbl,
  fit_label           = "pnbd_cdnow_fixed2",
  fit_end_dttm        = use_fit_end_date     |> as.POSIXct(),
  valid_start_dttm    = use_valid_start_date |> as.POSIXct(),
  valid_end_dttm      = use_valid_end_date   |> as.POSIXct(),
  precompute_rootdir  = "precompute",
  data_dir            = "data",
  summary_include_tnx = FALSE,
  sim_seed            = 20
  )

pnbd_cdnow_fixed2_assess_data_lst |> glimpse()
List of 5
 $ model_fit_index_filepath     : 'glue' chr "data/pnbd_cdnow_fixed2_assess_fit_index_tbl.rds"
 $ model_valid_index_filepath   : 'glue' chr "data/pnbd_cdnow_fixed2_assess_valid_index_tbl.rds"
 $ model_simstats_filepath      : 'glue' chr "data/pnbd_cdnow_fixed2_assess_model_simstats_tbl.rds"
 $ model_fit_simstats_filepath  : 'glue' chr "data/pnbd_cdnow_fixed2_assess_fit_simstats_tbl.rds"
 $ model_valid_simstats_filepath: 'glue' chr "data/pnbd_cdnow_fixed2_assess_valid_simstats_tbl.rds"

3.2.1 Check In-Sample Data Validation

We first check the model against the in-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed2_assess_data_lst |>
  use_series(model_fit_simstats_filepath) |>
  read_rds()

insample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_fit_subset_tbl,
  simdata_tbl = simdata_tbl
  )

insample_plots_lst$multi_plot |> print()

Show code
insample_plots_lst$total_plot |> print()

Show code
insample_plots_lst$quant_plot |> print()

This fit looks reasonable and appears to capture most of the aspects of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but at least we appreciate that our model is valid.

3.2.2 Check Out-of-Sample Data Validation

We now repeat for the out-of-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed2_assess_data_lst |>
  use_series(model_valid_simstats_filepath) |>
  read_rds()

outsample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_valid_subset_tbl,
  simdata_tbl = simdata_tbl
  )

outsample_plots_lst$multi_plot |> print()

Show code
outsample_plots_lst$total_plot |> print()

Show code
outsample_plots_lst$quant_plot |> print()

4 Fit Tight-Lifetime Model

We now want to try a model where we use priors with a tighter coefficient of variation for lifetime but keep the CoV for transaction frequency.

Show code
stan_modelname <- "pnbd_cdnow_fixed3"
stanfit_seed   <- stanfit_seed + 1
stanfit_prefix <- str_c("fit_", stan_modelname) 

stanfit_object_file <- glue("data/{stanfit_prefix}_stanfit.rds")


stan_data_lst <- customer_fit_stats_tbl |>
  select(customer_id, x, t_x, T_cal) |>
  compose_data(
    lambda_mn = 0.25,
    lambda_cv = 1.00,
    
    mu_mn     = 0.10,
    mu_cv     = 0.50,
    )

if(!file_exists(stanfit_object_file)) {
  pnbd_cdnow_fixed3_stanfit <- pnbd_fixed_stanmodel$sample(
    data            =                stan_data_lst,
    chains          =                            4,
    iter_warmup     =                          500,
    iter_sampling   =                          500,
    seed            =                 stanfit_seed,
    save_warmup     =                         TRUE,
    output_dir      =                stan_modeldir,
    output_basename =               stanfit_prefix,
    )

  pnbd_cdnow_fixed3_stanfit$save_object(stanfit_object_file, compress = "gzip")

} else {
  pnbd_cdnow_fixed3_stanfit <- read_rds(stanfit_object_file)
}
  
pnbd_cdnow_fixed3_stanfit$summary()
# A tibble: 70,711 × 10
   variable       mean   median      sd     mad       q5      q95  rhat ess_bulk
   <chr>         <num>    <num>   <num>   <num>    <num>    <num> <num>    <num>
 1 lp__       -5.26e+5 -5.26e+5 1.58e+2 1.65e+2 -5.26e+5 -5.26e+5 1.00      685.
 2 lambda[1]   1.39e-1  8.41e-2 1.60e-1 9.38e-2  5.05e-3  4.74e-1 0.999    2936.
 3 lambda[2]   3.32e-1  2.45e-1 2.87e-1 2.10e-1  4.10e-2  8.84e-1 1.00     3632.
 4 lambda[3]   9.16e-2  8.56e-2 4.09e-2 3.90e-2  3.57e-2  1.67e-1 1.00     4681.
 5 lambda[4]   7.17e-2  6.52e-2 3.66e-2 3.31e-2  2.31e-2  1.42e-1 1.00     3593.
 6 lambda[5]   1.78e-1  1.71e-1 6.03e-2 5.65e-2  9.09e-2  2.89e-1 1.00     4086.
 7 lambda[6]   1.41e-1  8.47e-2 1.72e-1 9.44e-2  5.55e-3  4.64e-1 1.00     3019.
 8 lambda[7]   3.87e-2  3.14e-2 2.83e-2 2.44e-2  6.68e-3  9.16e-2 1.00     3098.
 9 lambda[8]   1.25e-1  1.20e-1 4.62e-2 4.74e-2  6.04e-2  2.11e-1 1.00     3931.
10 lambda[9]   6.62e-2  5.44e-2 4.83e-2 4.00e-2  1.13e-2  1.62e-1 1.00     3758.
# ℹ 70,701 more rows
# ℹ 1 more variable: ess_tail <num>

We have some basic HMC-based validity statistics we can check.

Show code
pnbd_cdnow_fixed3_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed3-1.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed3-2.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed3-3.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed3-4.csvWarning: non-fatal error reading adaptation data


Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
No divergent transitions found.

Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete, no problems detected.

4.1 Visual Diagnostics of the Sample Validity

Now that we have a sample from the posterior distribution we need to create a few different visualisations of the diagnostics.

Show code
parameter_subset <- c(
  "lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
  "mu[1]",     "mu[2]",     "mu[3]",     "mu[4]"
  )

pnbd_cdnow_fixed3_stanfit$draws(inc_warmup = FALSE) |>
  mcmc_trace(pars = parameter_subset) +
  expand_limits(y = 0) +
  labs(
    x = "Iteration",
    y = "Value",
    title = "Traceplot of Sample of Lambda and Mu Values"
    ) +
  theme(axis.text.x = element_text(size = 10))

We want to check the \(N_{eff}\) statistics also.

Show code
pnbd_cdnow_fixed3_stanfit |>
  neff_ratio(pars = c("lambda", "mu")) |>
  as.numeric() |>
  mcmc_neff() +
    ggtitle("Plot of Parameter Effective Sample Sizes")

4.2 Assess the Model

As we intend to run the same logic to assess each of our models, we have combined all this logic into a single function run_model_assessment, to run the simulations and combine the datasets.

We first run the assessment data.

Show code
pnbd_stanfit <- pnbd_cdnow_fixed3_stanfit |>
  recover_types(customer_fit_stats_tbl)

pnbd_cdnow_fixed3_assess_data_lst <- run_model_assessment(
  model_stanfit       = pnbd_stanfit,
  insample_tbl        = customer_fit_subset_tbl,
  fit_label           = "pnbd_cdnow_fixed3",
  fit_end_dttm        = use_fit_end_date     |> as.POSIXct(),
  valid_start_dttm    = use_valid_start_date |> as.POSIXct(),
  valid_end_dttm      = use_valid_end_date   |> as.POSIXct(),
  precompute_rootdir  = "precompute",
  data_dir            = "data",
  summary_include_tnx = FALSE,
  sim_seed            = 30
  )

pnbd_cdnow_fixed3_assess_data_lst |> glimpse()
List of 5
 $ model_fit_index_filepath     : 'glue' chr "data/pnbd_cdnow_fixed3_assess_fit_index_tbl.rds"
 $ model_valid_index_filepath   : 'glue' chr "data/pnbd_cdnow_fixed3_assess_valid_index_tbl.rds"
 $ model_simstats_filepath      : 'glue' chr "data/pnbd_cdnow_fixed3_assess_model_simstats_tbl.rds"
 $ model_fit_simstats_filepath  : 'glue' chr "data/pnbd_cdnow_fixed3_assess_fit_simstats_tbl.rds"
 $ model_valid_simstats_filepath: 'glue' chr "data/pnbd_cdnow_fixed3_assess_valid_simstats_tbl.rds"

4.2.1 Check In-Sample Data Validation

We first check the model against the in-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed3_assess_data_lst |>
  use_series(model_fit_simstats_filepath) |>
  read_rds()

insample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_fit_subset_tbl,
  simdata_tbl = simdata_tbl
  )

insample_plots_lst$multi_plot |> print()

Show code
insample_plots_lst$total_plot |> print()

Show code
insample_plots_lst$quant_plot |> print()

This fit looks reasonable and appears to capture most of the aspects of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but at least we appreciate that our model is valid.

4.2.2 Check Out-of-Sample Data Validation

We now repeat for the out-of-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed3_assess_data_lst |>
  use_series(model_valid_simstats_filepath) |>
  read_rds()

outsample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_valid_subset_tbl,
  simdata_tbl = simdata_tbl
  )

outsample_plots_lst$multi_plot |> print()

Show code
outsample_plots_lst$total_plot |> print()

Show code
outsample_plots_lst$quant_plot |> print()

5 Fit Narrow-Short-Lifetime Model

We now want to try a model where we use priors with a tighter coefficient of variation for lifetime but keep the CoV for transaction frequency.

Show code
stan_modelname <- "pnbd_cdnow_fixed4"
stanfit_seed   <- stanfit_seed + 1
stanfit_prefix <- str_c("fit_", stan_modelname) 

stanfit_object_file <- glue("data/{stanfit_prefix}_stanfit.rds")


stan_data_lst <- customer_fit_stats_tbl |>
  select(customer_id, x, t_x, T_cal) |>
  compose_data(
    lambda_mn = 0.25,
    lambda_cv = 1.00,
    
    mu_mn     = 0.20,
    mu_cv     = 0.30,
    )

if(!file_exists(stanfit_object_file)) {
  pnbd_cdnow_fixed4_stanfit <- pnbd_fixed_stanmodel$sample(
    data            =                stan_data_lst,
    chains          =                            4,
    iter_warmup     =                          500,
    iter_sampling   =                          500,
    seed            =                 stanfit_seed,
    save_warmup     =                         TRUE,
    output_dir      =                stan_modeldir,
    output_basename =               stanfit_prefix,
    )

  pnbd_cdnow_fixed4_stanfit$save_object(stanfit_object_file, compress = "gzip")

} else {
  pnbd_cdnow_fixed4_stanfit <- read_rds(stanfit_object_file)
}

pnbd_cdnow_fixed4_stanfit$summary()
# A tibble: 70,711 × 10
   variable       mean   median      sd     mad       q5      q95  rhat ess_bulk
   <chr>         <num>    <num>   <num>   <num>    <num>    <num> <num>    <num>
 1 lp__       -9.13e+5 -9.13e+5 1.57e+2 1.54e+2 -9.13e+5 -9.13e+5  1.00     783.
 2 lambda[1]   1.56e-1  1.02e-1 1.73e-1 1.08e-1  6.39e-3  5.04e-1  1.00    2167.
 3 lambda[2]   3.48e-1  2.77e-1 2.68e-1 2.19e-1  5.61e-2  8.64e-1  1.00    3658.
 4 lambda[3]   9.26e-2  8.69e-2 4.20e-2 4.07e-2  3.70e-2  1.68e-1  1.00    3527.
 5 lambda[4]   7.30e-2  6.77e-2 3.59e-2 3.31e-2  2.51e-2  1.40e-1  1.01    2724.
 6 lambda[5]   1.80e-1  1.73e-1 5.73e-2 5.51e-2  9.66e-2  2.84e-1  1.00    3397.
 7 lambda[6]   1.63e-1  1.05e-1 1.78e-1 1.14e-1  5.50e-3  5.23e-1  1.00    1715.
 8 lambda[7]   4.08e-2  3.37e-2 3.00e-2 2.33e-2  7.68e-3  9.97e-2  1.00    2705.
 9 lambda[8]   1.25e-1  1.18e-1 4.78e-2 4.65e-2  5.85e-2  2.12e-1  1.00    3133.
10 lambda[9]   7.40e-2  6.34e-2 5.15e-2 4.42e-2  1.39e-2  1.73e-1  1.00    2791.
# ℹ 70,701 more rows
# ℹ 1 more variable: ess_tail <num>

We have some basic HMC-based validity statistics we can check.

Show code
pnbd_cdnow_fixed4_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed4-1.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed4-2.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed4-3.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_cdnow_fixed4-4.csvWarning: non-fatal error reading adaptation data


Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
No divergent transitions found.

Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete, no problems detected.

5.1 Visual Diagnostics of the Sample Validity

Now that we have a sample from the posterior distribution we need to create a few different visualisations of the diagnostics.

Show code
parameter_subset <- c(
  "lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
  "mu[1]",     "mu[2]",     "mu[3]",     "mu[4]"
  )

pnbd_cdnow_fixed4_stanfit$draws(inc_warmup = FALSE) |>
  mcmc_trace(pars = parameter_subset) +
  expand_limits(y = 0) +
  labs(
    x = "Iteration",
    y = "Value",
    title = "Traceplot of Sample of Lambda and Mu Values"
    ) +
  theme(axis.text.x = element_text(size = 10))

We want to check the \(N_{eff}\) statistics also.

Show code
pnbd_cdnow_fixed4_stanfit |>
  neff_ratio(pars = c("lambda", "mu")) |>
  as.numeric() |>
  mcmc_neff() +
    ggtitle("Plot of Parameter Effective Sample Sizes")

5.2 Assess the Model

As we intend to run the same logic to assess each of our models, we have combined all this logic into a single function run_model_assessment, to run the simulations and combine the datasets.

We first run the assessment data.

Show code
pnbd_stanfit <- pnbd_cdnow_fixed4_stanfit |>
  recover_types(customer_fit_stats_tbl)

pnbd_cdnow_fixed4_assess_data_lst <- run_model_assessment(
  model_stanfit       = pnbd_stanfit,
  insample_tbl        = customer_fit_subset_tbl,
  fit_label           = "pnbd_cdnow_fixed4",
  fit_end_dttm        = use_fit_end_date     |> as.POSIXct(),
  valid_start_dttm    = use_valid_start_date |> as.POSIXct(),
  valid_end_dttm      = use_valid_end_date   |> as.POSIXct(),
  precompute_rootdir  = "precompute",
  data_dir            = "data",
  summary_include_tnx = FALSE,
  sim_seed            = 40
  )

pnbd_cdnow_fixed4_assess_data_lst |> glimpse()
List of 5
 $ model_fit_index_filepath     : 'glue' chr "data/pnbd_cdnow_fixed4_assess_fit_index_tbl.rds"
 $ model_valid_index_filepath   : 'glue' chr "data/pnbd_cdnow_fixed4_assess_valid_index_tbl.rds"
 $ model_simstats_filepath      : 'glue' chr "data/pnbd_cdnow_fixed4_assess_model_simstats_tbl.rds"
 $ model_fit_simstats_filepath  : 'glue' chr "data/pnbd_cdnow_fixed4_assess_fit_simstats_tbl.rds"
 $ model_valid_simstats_filepath: 'glue' chr "data/pnbd_cdnow_fixed4_assess_valid_simstats_tbl.rds"

5.2.1 Check In-Sample Data Validation

We first check the model against the in-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed4_assess_data_lst |>
  use_series(model_fit_simstats_filepath) |>
  read_rds()

insample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_fit_subset_tbl,
  simdata_tbl = simdata_tbl
  )

insample_plots_lst$multi_plot |> print()

Show code
insample_plots_lst$total_plot |> print()

Show code
insample_plots_lst$quant_plot |> print()

This fit looks reasonable and appears to capture most of the aspects of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but at least we appreciate that our model is valid.

5.2.2 Check Out-of-Sample Data Validation

We now repeat for the out-of-sample data.

Show code
simdata_tbl <- pnbd_cdnow_fixed4_assess_data_lst |>
  use_series(model_valid_simstats_filepath) |>
  read_rds()

outsample_plots_lst <- create_model_assessment_plots(
  obsdata_tbl = customer_valid_subset_tbl,
  simdata_tbl = simdata_tbl
  )

outsample_plots_lst$multi_plot |> print()

Show code
outsample_plots_lst$total_plot |> print()

Show code
outsample_plots_lst$quant_plot |> print()

6 Compare Model Outputs

We have looked at each of the models individually, but it is also worth looking at each of the models as a group.

Show code
calculate_simulation_statistics <- function(file_rds) {
  simdata_tbl <- read_rds(file_rds)
  
  multicount_cust_tbl <- simdata_tbl |>
    filter(sim_tnx_count > 0) |>
    count(draw_id, name = "multicust_count")
  
  totaltnx_data_tbl <- simdata_tbl |>
    count(draw_id, wt = sim_tnx_count, name = "simtnx_count")
  
  simstats_tbl <- multicount_cust_tbl |>
    inner_join(totaltnx_data_tbl, by = "draw_id")
  
  return(simstats_tbl)
}
Show code
obs_fit_customer_count <- customer_fit_subset_tbl |>
  filter(tnx_count > 0) |>
  nrow()

obs_valid_customer_count <- customer_valid_subset_tbl |>
  filter(tnx_count > 0) |>
  nrow()

obs_fit_total_count <- customer_fit_subset_tbl |>
  pull(tnx_count) |>
  sum()

obs_valid_total_count <- customer_valid_subset_tbl |>
  pull(tnx_count) |>
  sum()


obs_stats_tbl <- tribble(
  ~assess_type, ~name,               ~obs_value,
  "fit",        "multicust_count",   obs_fit_customer_count,
  "fit",        "simtnx_count",      obs_fit_total_count,
  "valid",      "multicust_count",   obs_valid_customer_count,
  "valid",      "simtnx_count",      obs_valid_total_count
  )


model_assess_tbl <- dir_ls("data", regexp = "pnbd_cdnow_fixed.*_assess_.*simstats") |>
  enframe(name = NULL, value = "file_path") |>
  filter(str_detect(file_path, "_assess_model_", negate = TRUE)) |>
  mutate(
    model_label = str_replace(file_path, "data/pnbd_cdnow_(.*?)_assess_.*", "\\1"),
    assess_type = if_else(str_detect(file_path, "_assess_fit_"), "fit", "valid"),
    
    sim_data = map(
      file_path, calculate_simulation_statistics,
      
      .progress = "calculate_simulation_statistics"
      )
    )

model_assess_tbl |> glimpse()
Rows: 8
Columns: 4
$ file_path   <fs::path> "data/pnbd_cdnow_fixed1_assess_fit_simstats_tbl.rds",…
$ model_label <chr> "fixed1", "fixed1", "fixed2", "fixed2", "fixed3", "fixed3"…
$ assess_type <chr> "fit", "valid", "fit", "valid", "fit", "valid", "fit", "va…
$ sim_data    <list> [<tbl_df[2000 x 3]>], [<tbl_df[2000 x 3]>], [<tbl_df[2000…
Show code
model_assess_summstat_tbl <- model_assess_tbl |>
  select(model_label, assess_type, sim_data) |>
  unnest(sim_data) |>
  pivot_longer(
    cols = !c(model_label, assess_type, draw_id)
    ) |>
  group_by(model_label, assess_type, name) |>
  summarise(
    .groups = "drop",
    
    mean_val = mean(value),
    p10 = quantile(value, 0.10),
    p25 = quantile(value, 0.25),
    p50 = quantile(value, 0.50),
    p75 = quantile(value, 0.75),
    p90 = quantile(value, 0.90)
    )

model_assess_summstat_tbl |> glimpse()
Rows: 16
Columns: 9
$ model_label <chr> "fixed1", "fixed1", "fixed1", "fixed1", "fixed2", "fixed2"…
$ assess_type <chr> "fit", "fit", "valid", "valid", "fit", "fit", "valid", "va…
$ name        <chr> "multicust_count", "simtnx_count", "multicust_count", "sim…
$ mean_val    <dbl> 1098.7000, 4283.9665, 297.7140, 914.8950, 1299.9255, 4667.…
$ p10         <dbl> 1072.0, 4090.0, 283.0, 854.0, 1272.0, 4484.9, 181.0, 548.0…
$ p25         <dbl> 1084.00, 4181.00, 290.00, 882.00, 1285.75, 4570.00, 187.00…
$ p50         <dbl> 1099.0, 4277.0, 297.0, 915.0, 1300.0, 4663.0, 194.0, 605.0…
$ p75         <dbl> 1114.00, 4384.25, 306.00, 948.00, 1314.00, 4765.00, 201.00…
$ p90         <dbl> 1126.0, 4479.0, 313.0, 979.0, 1328.0, 4850.0, 207.0, 662.0…
Show code
#! echo: TRUE

ggplot(model_assess_summstat_tbl) +
  geom_errorbar(
    aes(x = model_label, ymin = p10, ymax = p90), width = 0
    ) +
  geom_errorbar(
    aes(x = model_label, ymin = p25, ymax = p75), width = 0, linewidth = 3
    ) +
  geom_hline(
    aes(yintercept = obs_value),
    data = obs_stats_tbl, colour = "red"
    ) +
  scale_y_continuous(labels = label_comma()) +
  expand_limits(y = 0) +
  facet_wrap(
    vars(assess_type, name), scale = "free_y"
    ) +
  labs(
    x = "Model",
    y = "Count",
    title = "Comparison Plot for the Different Models"
    ) +
  theme(
    axis.text.x = element_text(angle = 20, vjust = 0.5, size = 8)
    )

6.1 Write Assessment Data to Disk

We now want to save the assessment data to disk.

Show code
model_assess_tbl |> write_rds("data/assess_data_pnbd_cdnow_fixed_tbl.rds")

7 R Environment

Show code
options(width = 120L)
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.3 (2023-03-15)
 os       Ubuntu 22.04.2 LTS
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Dublin
 date     2023-07-16
 pandoc   2.19.2 @ /usr/local/bin/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
 package        * version   date (UTC) lib source
 abind            1.4-5     2016-07-21 [1] RSPM (R 4.2.0)
 arrayhelpers     1.1-0     2020-02-04 [1] RSPM (R 4.2.0)
 backports        1.4.1     2021-12-13 [1] RSPM (R 4.2.0)
 base64enc        0.1-3     2015-07-28 [1] RSPM (R 4.2.0)
 bayesplot      * 1.10.0    2022-11-16 [1] RSPM (R 4.2.0)
 bit              4.0.5     2022-11-15 [1] RSPM (R 4.2.0)
 bit64            4.0.5     2020-08-30 [1] RSPM (R 4.2.0)
 boot             1.3-28.1  2022-11-22 [2] CRAN (R 4.2.3)
 bridgesampling   1.1-2     2021-04-16 [1] RSPM (R 4.2.0)
 brms           * 2.19.0    2023-03-14 [1] RSPM (R 4.2.0)
 Brobdingnag      1.2-9     2022-10-19 [1] RSPM (R 4.2.0)
 cachem           1.0.7     2023-02-24 [1] RSPM (R 4.2.0)
 callr            3.7.3     2022-11-02 [1] RSPM (R 4.2.0)
 checkmate        2.1.0     2022-04-21 [1] RSPM (R 4.2.0)
 cli              3.6.1     2023-03-23 [1] RSPM (R 4.2.0)
 cmdstanr       * 0.5.3     2023-06-29 [1] Github (stan-dev/cmdstanr@22b391e)
 coda             0.19-4    2020-09-30 [1] RSPM (R 4.2.0)
 codetools        0.2-19    2023-02-01 [2] CRAN (R 4.2.3)
 colorspace       2.1-0     2023-01-23 [1] RSPM (R 4.2.0)
 colourpicker     1.2.0     2022-10-28 [1] RSPM (R 4.2.0)
 conflicted     * 1.2.0     2023-02-01 [1] RSPM (R 4.2.0)
 cowplot        * 1.1.1     2020-12-30 [1] RSPM (R 4.2.0)
 crayon           1.5.2     2022-09-29 [1] RSPM (R 4.2.0)
 crosstalk        1.2.0     2021-11-04 [1] RSPM (R 4.2.0)
 digest           0.6.31    2022-12-11 [1] RSPM (R 4.2.0)
 directlabels   * 2021.1.13 2021-01-16 [1] RSPM (R 4.2.0)
 distributional   0.3.2     2023-03-22 [1] RSPM (R 4.2.0)
 dplyr          * 1.1.1     2023-03-22 [1] RSPM (R 4.2.0)
 DT               0.27      2023-01-17 [1] RSPM (R 4.2.0)
 dygraphs         1.1.1.6   2018-07-11 [1] RSPM (R 4.2.0)
 ellipsis         0.3.2     2021-04-29 [1] RSPM (R 4.2.0)
 evaluate         0.20      2023-01-17 [1] RSPM (R 4.2.0)
 fansi            1.0.4     2023-01-22 [1] RSPM (R 4.2.0)
 farver           2.1.1     2022-07-06 [1] RSPM (R 4.2.0)
 fastmap          1.1.1     2023-02-24 [1] RSPM (R 4.2.0)
 forcats        * 1.0.0     2023-01-29 [1] RSPM (R 4.2.0)
 fs             * 1.6.1     2023-02-06 [1] RSPM (R 4.2.0)
 furrr          * 0.3.1     2022-08-15 [1] RSPM (R 4.2.0)
 future         * 1.32.0    2023-03-07 [1] RSPM (R 4.2.0)
 gamm4            0.2-6     2020-04-03 [1] RSPM (R 4.2.0)
 generics         0.1.3     2022-07-05 [1] RSPM (R 4.2.0)
 ggdist           3.2.1     2023-01-18 [1] RSPM (R 4.2.0)
 ggplot2        * 3.4.2     2023-04-03 [1] RSPM (R 4.2.0)
 globals          0.16.2    2022-11-21 [1] RSPM (R 4.2.0)
 glue           * 1.6.2     2022-02-24 [1] RSPM (R 4.2.0)
 gridExtra        2.3       2017-09-09 [1] RSPM (R 4.2.0)
 gtable           0.3.3     2023-03-21 [1] RSPM (R 4.2.0)
 gtools           3.9.4     2022-11-27 [1] RSPM (R 4.2.0)
 hms              1.1.3     2023-03-21 [1] RSPM (R 4.2.0)
 htmltools        0.5.5     2023-03-23 [1] RSPM (R 4.2.0)
 htmlwidgets      1.6.2     2023-03-17 [1] RSPM (R 4.2.0)
 httpuv           1.6.9     2023-02-14 [1] RSPM (R 4.2.0)
 igraph           1.4.2     2023-04-07 [1] RSPM (R 4.2.0)
 inline           0.3.19    2021-05-31 [1] RSPM (R 4.2.0)
 jsonlite         1.8.4     2022-12-06 [1] RSPM (R 4.2.0)
 knitr            1.42      2023-01-25 [1] RSPM (R 4.2.0)
 labeling         0.4.2     2020-10-20 [1] RSPM (R 4.2.0)
 later            1.3.0     2021-08-18 [1] RSPM (R 4.2.0)
 lattice          0.20-45   2021-09-22 [2] CRAN (R 4.2.3)
 lifecycle        1.0.3     2022-10-07 [1] RSPM (R 4.2.0)
 listenv          0.9.0     2022-12-16 [1] RSPM (R 4.2.0)
 lme4             1.1-32    2023-03-14 [1] RSPM (R 4.2.0)
 loo              2.6.0     2023-03-31 [1] RSPM (R 4.2.0)
 lubridate      * 1.9.2     2023-02-10 [1] RSPM (R 4.2.0)
 magrittr       * 2.0.3     2022-03-30 [1] RSPM (R 4.2.0)
 markdown         1.6       2023-04-07 [1] RSPM (R 4.2.0)
 MASS             7.3-58.2  2023-01-23 [2] CRAN (R 4.2.3)
 Matrix           1.5-3     2022-11-11 [2] CRAN (R 4.2.3)
 matrixStats      0.63.0    2022-11-18 [1] RSPM (R 4.2.0)
 memoise          2.0.1     2021-11-26 [1] RSPM (R 4.2.0)
 mgcv             1.8-42    2023-03-02 [2] CRAN (R 4.2.3)
 mime             0.12      2021-09-28 [1] RSPM (R 4.2.0)
 miniUI           0.1.1.1   2018-05-18 [1] RSPM (R 4.2.0)
 minqa            1.2.5     2022-10-19 [1] RSPM (R 4.2.0)
 munsell          0.5.0     2018-06-12 [1] RSPM (R 4.2.0)
 mvtnorm          1.1-3     2021-10-08 [1] RSPM (R 4.2.0)
 nlme             3.1-162   2023-01-31 [2] CRAN (R 4.2.3)
 nloptr           2.0.3     2022-05-26 [1] RSPM (R 4.2.0)
 parallelly       1.35.0    2023-03-23 [1] RSPM (R 4.2.0)
 pillar           1.9.0     2023-03-22 [1] RSPM (R 4.2.0)
 pkgbuild         1.4.0     2022-11-27 [1] RSPM (R 4.2.0)
 pkgconfig        2.0.3     2019-09-22 [1] RSPM (R 4.2.0)
 plyr             1.8.8     2022-11-11 [1] RSPM (R 4.2.0)
 posterior      * 1.4.1     2023-03-14 [1] RSPM (R 4.2.0)
 prettyunits      1.1.1     2020-01-24 [1] RSPM (R 4.2.0)
 processx         3.8.1     2023-04-18 [1] RSPM (R 4.2.0)
 projpred         2.5.0     2023-04-05 [1] RSPM (R 4.2.0)
 promises         1.2.0.1   2021-02-11 [1] RSPM (R 4.2.0)
 ps               1.7.5     2023-04-18 [1] RSPM (R 4.2.0)
 purrr          * 1.0.1     2023-01-10 [1] RSPM (R 4.2.0)
 quadprog         1.5-8     2019-11-20 [1] RSPM (R 4.2.0)
 R6               2.5.1     2021-08-19 [1] RSPM (R 4.2.0)
 Rcpp           * 1.0.10    2023-01-22 [1] RSPM (R 4.2.0)
 RcppParallel     5.1.7     2023-02-27 [1] RSPM (R 4.2.0)
 readr          * 2.1.4     2023-02-10 [1] RSPM (R 4.2.0)
 reshape2         1.4.4     2020-04-09 [1] RSPM (R 4.2.0)
 rlang          * 1.1.0     2023-03-14 [1] RSPM (R 4.2.0)
 rmarkdown        2.21      2023-03-26 [1] RSPM (R 4.2.0)
 rstan            2.21.8    2023-01-17 [1] RSPM (R 4.2.0)
 rstantools       2.3.1     2023-03-30 [1] RSPM (R 4.2.0)
 rsyslog        * 1.0.2     2021-06-04 [1] RSPM (R 4.2.0)
 scales         * 1.2.1     2022-08-20 [1] RSPM (R 4.2.0)
 sessioninfo      1.2.2     2021-12-06 [1] RSPM (R 4.2.0)
 shiny            1.7.4     2022-12-15 [1] RSPM (R 4.2.0)
 shinyjs          2.1.0     2021-12-23 [1] RSPM (R 4.2.0)
 shinystan        2.6.0     2022-03-03 [1] RSPM (R 4.2.0)
 shinythemes      1.2.0     2021-01-25 [1] RSPM (R 4.2.0)
 StanHeaders      2.21.0-7  2020-12-17 [1] RSPM (R 4.2.0)
 stringi          1.7.12    2023-01-11 [1] RSPM (R 4.2.0)
 stringr        * 1.5.0     2022-12-02 [1] RSPM (R 4.2.0)
 svUnit           1.0.6     2021-04-19 [1] RSPM (R 4.2.0)
 tensorA          0.36.2    2020-11-19 [1] RSPM (R 4.2.0)
 threejs          0.3.3     2020-01-21 [1] RSPM (R 4.2.0)
 tibble         * 3.2.1     2023-03-20 [1] RSPM (R 4.2.0)
 tidybayes      * 3.0.4     2023-03-14 [1] RSPM (R 4.2.0)
 tidyr          * 1.3.0     2023-01-24 [1] RSPM (R 4.2.0)
 tidyselect       1.2.0     2022-10-10 [1] RSPM (R 4.2.0)
 tidyverse      * 2.0.0     2023-02-22 [1] RSPM (R 4.2.0)
 timechange       0.2.0     2023-01-11 [1] RSPM (R 4.2.0)
 tzdb             0.3.0     2022-03-28 [1] RSPM (R 4.2.0)
 utf8             1.2.3     2023-01-31 [1] RSPM (R 4.2.0)
 vctrs            0.6.2     2023-04-19 [1] RSPM (R 4.2.0)
 vroom            1.6.1     2023-01-22 [1] RSPM (R 4.2.0)
 withr            2.5.0     2022-03-03 [1] RSPM (R 4.2.0)
 xfun             0.38      2023-03-24 [1] RSPM (R 4.2.0)
 xtable           1.8-4     2019-04-21 [1] RSPM (R 4.2.0)
 xts              0.13.1    2023-04-16 [1] RSPM (R 4.2.0)
 yaml             2.3.7     2023-01-23 [1] RSPM (R 4.2.0)
 zoo              1.8-12    2023-04-13 [1] RSPM (R 4.2.0)

 [1] /usr/local/lib/R/site-library
 [2] /usr/local/lib/R/library

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Show code
options(width = 80L)